Population-Scale Sequencing Data Enable Precise Estimates of Y-STR Mutation Rates.

Authors

  • Thomas Willems
  • Melissa Gymrek
  • G David Poznik
  • Chris Tyler-Smith
  • Yaniv Erlich
Abstract

Short tandem repeats (STRs) are mutation-prone loci that span nearly 1% of the human genome. Previous studies have estimated the mutation rates of highly polymorphic STRs by using capillary electrophoresis and pedigree-based designs. Although this work has provided insights into the mutational dynamics of highly mutable STRs, the mutation rates of most others remain unknown. Here, we harnessed whole-genome sequencing data to estimate the mutation rates of Y chromosome STRs (Y-STRs) with 2-6 bp repeat units that are accessible to Illumina sequencing. We genotyped 4,500 Y-STRs by using data from the 1000 Genomes Project and the Simons Genome Diversity Project. Next, we developed MUTEA, an algorithm that infers STR mutation rates from population-scale data by using a high-resolution SNP-based phylogeny. After extensive intrinsic and extrinsic validations, we harnessed MUTEA to derive mutation-rate estimates for 702 polymorphic STRs by tracing each locus over 222,000 meioses, resulting in the largest collection of Y-STR mutation rates to date. Using our estimates, we identified determinants of STR mutation rates and built a model to predict rates for STRs across the genome. These predictions indicate that the load of de novo STR mutations is at least 75 mutations per generation, rivaling the load of all other known variant types. Finally, we identified Y-STRs with potential applications in forensics and genetic genealogy, assessed the ability to differentiate between the Y chromosomes of father-son pairs, and imputed Y-STR genotypes.
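The abstract's scale figures can be made concrete with a little arithmetic. The sketch below is not MUTEA and does not reproduce the authors' model. It only assumes that the expected number of mutation events at a locus is its per-generation rate multiplied by the number of meioses traced, and that the expected per-generation load across loci is the sum of the per-locus rates. The function names and the three rates are hypothetical placeholders, not values from the paper.

```python
from typing import Iterable

# Number of meioses traced per locus, as quoted in the abstract.
MEIOSES_TRACED = 222_000


def expected_events(rate_per_generation: float, meioses: int = MEIOSES_TRACED) -> float:
    """Expected number of mutation events at one locus over `meioses` transmissions."""
    return rate_per_generation * meioses


def expected_load(rates_per_generation: Iterable[float]) -> float:
    """Expected number of new mutations per generation, summed over loci.

    By linearity of expectation this is the sum of the per-locus rates.
    """
    return sum(rates_per_generation)


# Hypothetical per-locus rates (mutations per generation); NOT from the paper.
hypothetical_rates = [1e-3, 1e-4, 1e-5]

events = [expected_events(r) for r in hypothetical_rates]
load = expected_load(hypothetical_rates)

for rate, n in zip(hypothetical_rates, events):
    print(f"rate {rate:.0e}: about {n:.1f} expected events over {MEIOSES_TRACED} meioses")
print(f"expected load across these loci: {load:.5f} mutations per generation")
```

Under these assumptions, a locus that mutates once per 100,000 generations would still be expected to show a couple of events across the traced meioses, which gives a rough sense of why a deep phylogeny makes slowly mutating STRs measurable.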

Similar resources

Chromosome-wide characterization of Y-STR mutation rates using ultra-deep genealogies

Although the utility of short tandem repeats on the Y-chromosome (Y-STRs) has long been recognized and leveraged in forensics, genealogy and paternity testing, the bulk of these applications have relied on only a few dozen loci identified as having remarkably high mutation rates. Recent efforts have expanded the set of Y-STRs with known mutation rates to two hundred markers, but the limited thr...

Maximum likelihood estimation of locus-specific mutation rates in Y-chromosome short tandem repeats

MOTIVATION Y-chromosome short tandem repeats (Y-STRs) are widely used for population studies, forensic purposes and, potentially, the study of disease, therefore knowledge of their mutation rate is valuable. Here we show a novel method for estimation of site-specific Y-STR mutation rates from partial phylogenetic information, via the maximum likelihood framework. RESULTS Given Y-STR data clas...

Precision and accuracy of divergence time estimates from STR and SNPSTR variation.

Inference of intraspecific population divergence patterns typically requires genetic data for molecular markers with relatively high mutation rates. Microsatellites, or short tandem repeat (STR) polymorphisms, have proven informative in many such investigations. These markers are characterized, however, by high levels of homoplasy and varying mutational properties, often leading to inaccurate i...

Population-Scale Sequencing Data Enables Precise Estimates of Y-STR Mutation Rates

1 New York Genome Center, New York, NY 10013, USA 2 Computational and Systems Biology Program, MIT, Cambridge, MA 02139, USA 3 Whitehead Institute for Biomedical Research, 9 Cambridge Center, Cambridge, MA 02139, USA 4 Harvard-MIT Division of Health Sciences and Technology, MIT, Cambridge, MA 02139, USA 5 Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge,...

Comparison of Y-chromosomal lineage dating using either evolutionary or genealogical Y-STR mutation rates

We have compared the Y chromosomal lineage dating between sequence data and commonly used Y-SNP plus Y-STR data. The coalescent times estimated using evolutionary Y-STR mutation rates correspond best with sequence-based dating when the lineages include the most ancient haplogroup A individuals. However, the times using slow mutated STR markers with genealogical rates fit well with sequence-base...


Journal:
  • American Journal of Human Genetics

Volume 98, Issue 5

Publication date: 2016